Aggregated Estimators and Empirical Complexity for Least Square Regression
Author
Abstract
Numerous empirical results have shown that combining regression procedures can be a very efficient method. This work provides PAC bounds for the L2 generalization error of such methods. The interest of these bounds is twofold. First, they give, for any aggregating procedure, a bound on the expected risk that depends on the empirical risk and on the empirical complexity, measured by the Kullback-Leibler divergence between the aggregating distribution ρ̂ and a prior distribution π and by the empirical mean of the variance of the regression functions under the probability ρ̂. Second, by structural risk minimization, we derive an aggregating procedure which takes advantage of the unknown properties of the best mixture f̃: when the best convex combination f̃ of d regression functions belongs to the d initial functions (i.e. when combining does not make the bias decrease), the convergence rate is of order (log d)/N. In the worst case, our combining procedure achieves a convergence rate of order √((log d)/N), which is known to be optimal in a uniform sense when d > √N (see [10, 15]). As in AdaBoost, our aggregating distribution tends to favor functions which disagree with the mixture on mispredicted points. Our algorithm is tested on artificial classification data (which have also been used for testing other boosting methods, such as AdaBoost).
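The three empirical quantities named in the abstract (the empirical risk of the mixture, the Kullback-Leibler divergence between ρ̂ and π, and the empirical mean of the variance of the regression functions under ρ̂) can be illustrated for a finite family of d functions. The following is a minimal sketch, not the paper's bound or algorithm: the function name `complexity_terms`, the array layout, and the use of NumPy are assumptions made for illustration, and the constants that combine these terms into the actual PAC bound are not reproduced here.

```python
import numpy as np

def complexity_terms(preds, y, rho, pi):
    """Illustrative helper (hypothetical name, not from the paper).

    preds : (d, N) array, preds[j, i] = f_j(x_i) for d regression functions
    y     : (N,) array of observed outputs
    rho   : (d,) aggregating distribution over the d functions
    pi    : (d,) prior distribution over the d functions
    Returns (empirical risk of the mixture, KL(rho || pi),
             empirical mean of the variance under rho).
    """
    preds = np.asarray(preds, dtype=float)
    y = np.asarray(y, dtype=float)
    rho = np.asarray(rho, dtype=float)
    pi = np.asarray(pi, dtype=float)

    # Prediction of the mixture at each sample point: sum_j rho_j f_j(x_i)
    mixture = rho @ preds

    # Empirical L2 risk of the mixture
    emp_risk = np.mean((y - mixture) ** 2)

    # KL divergence, using the convention 0 * log 0 = 0
    mask = rho > 0
    kl = np.sum(rho[mask] * np.log(rho[mask] / pi[mask]))

    # Variance of f_j(x_i) under rho at each point, averaged over the sample
    emp_var = np.mean(rho @ (preds - mixture) ** 2)

    return emp_risk, kl, emp_var
```

With a uniform prior over d functions, a ρ̂ concentrated on a single function gives a divergence of log d and a zero variance term, whereas spreading ρ̂ over functions that disagree on the sample increases the variance term.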
Similar resources
Parameter Estimation Through Weighted Least-Squares Rank Regression with Specific Reference to the Weibull and Gumbel Distributions
Least squares regression based on probability plots, also called rank regression, can be used to estimate the parameters of some distributions. Regression is performed between a function of the empirical distribution function and the order statistic as the independent variable. Using large sample properties of the empirical distribution function and order statistics, weights to stabilize the va...
Estimation in multiple regression model with elliptically contoured errors under MLINEX loss
This paper considers estimation of the regression vector of the multiple regression model with elliptically symmetric contoured errors. The generalized least square (GLS), restricted GLS and preliminary test (PT) estimators for regression parameter vector are obtained. The performances of the estimators are studied under multiparameter linear exponential loss function (MLINEX), and the dominanc...
Robust Estimation of Multiple Regression Model with Non-normal Error: Symmetric Distribution
In this paper, we develop the modified maximum likelihood (MML) estimators for the multiple regression coefficients in linear model with the underlying distribution assumed to be symmetric, one of Student's t family. We obtain the closed form of the estimators and derive their asymptotic properties. In addition, we demonstrate that the MML estimators are more appropriate to estimate the paramet...
The Ratio-type Estimators of Variance with Minimum Average Square Error
Ratio-type estimators were introduced for estimating the population mean and total, but in recent years several estimators for the population variance have been proposed based on ratio methods. In this paper two families of estimators have been suggested and their approximate mean square errors (MSE) have been developed. In addition, the efficiency of these variance estimators are com...
Short Term Load Forecasting Using Empirical Mode Decomposition, Wavelet Transform and Support Vector Regression
Short-term forecasting of electric load plays an important role in the design and operation of power systems. Due to the nature of the short-term electric load time series (nonlinear, non-constant, and non-seasonal), accurate prediction of the load is very challenging. In this article, a method for short-term daily and hourly load forecasting is proposed. In this method, in the first step, t...
Journal title:
Volume, issue:
Pages: -
Publication date: 2004